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ABSTRACT. The analysis of whole-sky galaxy surveys commonly suffers from the 
problems of shot-noise and incomplete sky coverage (e.g. at the Zone of Avoidance). 
The orthogonal set of spherical harmonics is utilized here to expand the observed galaxy 
distribution. We show that in the framework of Bayesian statistics and Gaussian random 
fields the Att harmonics can be recovered and the shot-noise can be removed, giving the most 
probable picture of the underlying density field. The correction factor from observed to 
reconstructed harmonics turns out to be the well-known Wiener filter (the ratio of signal to 
signal -|-noise), which is also derived by requiring minimum variance. We apply the method 
to the projected 1.2 Jy IRAS survey. The reconstruction confirms the connectivity of 
the Supergalactic Plane across the Galactic Plane (at Galactic longitude / ~ 135° and 
/ ~ 315°) and the Puppis cluster behind the Galactic Plane (/ ~ 240°). The method can 
be extended to 3-D in both real and redshift space, and applied to other cosmic phenomena 
such as the COBE Microwave Background maps. 

Subject Headings: galaxies - large scale structure of the universe - surveys - data 
analysis 

1. INTRODUCTION 

Two basic problems commonly appear in analysing the distribution of galaxies. First, 
if one assumes that the distribution of luminous galaxies samples an underlying smooth 



density field, then the discreteness of objects introduces Poisson 'shot-noise'. Second, 
incomplete sky coverage, e.g. due to the obscuration by the Galactic Plane (the Zone of 
Avoidance), is an obstacle in mapping the whole-sky distribution. In this Letter we show 
how to recover the all-sky projected density field, characterized by an assumed power- 
spectrum of fluctuations, from a galaxy survey which suffers incomplete sky described by 
a known mask. 

The recovery of a signal from noisy and incomplete data is a classic problem of inver- 
sion. A straightforward inversion is often unstable, and a regularization scheme of some 
sort is essential in order to interpolate where data are missing or noisy. In the Bayesian 
spirit we use here raw data and a prior model to produce 'improved data'. The prior model 
does not necessarily require a speculative assumption. In the context of this work we sim- 
ply require a reconstruction which obeys the constraint of the 2-point correlation function 
of the observed galaxy distribution, as derived from a smaller section of the sky. Using the 
above principles we derive a Wiener fllter (the ratio of signal to signal -|-noise), which also 
follows from requiring minimum variance (e.g. Rybicki & Press 1992). There are many 
(related) variants of this approach, including Maximum Entropy (e.g. Gull 1989). The 
reconstruction problems can be addressed within the framework of conditional probabil- 
ity and constrained realizations of Gaussian random flelds (Bertschinger 1987, Binney & 
Quinn 1990, Hoffman & Ribak 1991). This has been applied to reconstruct the density 
fleld from the observed peculiar velocities (Kaiser & Stebbins 1991, Stebbins 1993, Ganon 
& Hoffman 1993). 

A simple example of a Wiener fllter is given in Section 2, and the formulation for the 
whole-sky reconstruction in spherical harmonics is shown in Section 3. Section 4 shows 
application to the projected 1.2 Jy IRAS survey, and future work is discussed in Section 5. 
We shall give the full mathematical details and extension of the method to 3-D harmonics 
elsewhere (Zaroubi et al., in preparation). 
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2. A SIMPLE EXAMPLE OF WIENER FILTER 



Let us first consider a simple pedagogical example. Assume two Gaussian variables, 
X and y, with zero mean, {x) = (y) = (hereafter (...) denote ensemble average). The 
probability for x given y is by the rule of conditional probability 



where P{y) is a 1-dimensional Gaussian probability distribution, and the joint probability 
P{x^ y) is a bi-variate Gaussian distribution. It is straightforward to show that P{x\y) is a 
'shifted Gaussian' with the maximum probability occurring for x = -j^fy y. In the special 
case of Gaussian fields the most probable reconstruction is also the mean field. Hereafter 
we term them together as the 'optimal reconstruction'. 

Exactly the same result for the 'optimal reconstruction' is also obtained by a different 
approach, by asking for the linear filter F which minimizes the variance ((x — FyY). 
Minimizing with respect to F gives indeed F = "l^fy and x = Fy, as above. Note that 
although the results of the two approaches are identical, due to the quadratic nature and 
linearity of filter of the functions involved, the assumptions involved are quite different. The 
conditional probability approach (eq. 2) requires to specify the full distribution functions 
(Gaussians in our case). On the other hand, the minimum variance approach only considers 
the variance (second moment) of the distribution function, but assumes a linear filter F. 

Consider now the special case that y = x-\-(t, where cr is a Gaussian noise uncorrelated 
with the true signal x (hence (xcr) =0). It follows that the optimal estimator of the signal 
X given the (noisy) measurement y is 



The factor [F) in front of the measurement y is the well-known Wiener filter commonly 
used in signal processing (for review see e.g. Press et al. 1992, Rybicki & Press 1992). Note 





X 
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that it requires a priori knowledge of the variances in the signal and the noise. When the 
noise is negligible the factor approaches unity, but when it is significant the measurement 
is attenuated. 

A third approach to the problem, in which the Bayesian a posteriori probability (cf. 
eq. 2), —\iiP{x\y) = —\\iiP{y\x)-\-\iiP{x)\ = [y — / {a'^) + x'^/{x'^), is maximized 
with respect to x, also yields the same result as the other two approaches. In fact, this 
is a special case of the (log-Likelihood) minimization subject to regularization of the 
form + af{x), where f(x) is the regularizing function (e.g. f{x) = x'^ in our case, and 
in other applications taken as the 'entropy' f{x) = xlnx), and a is a Lagrange multiplier. 
We see that a regularization with a prior f{x) = x'^ is essentially equivalent to a Wiener 
filter. 

3. NOISE REMOVAL AND MASK INVERSION 

For simplicity, we shall consider here projected (2-dimensional) galaxy samples. We 
formulate our problem as follows: What are the full-sky noise-free harmonics given the 
observed harmonics, the mask describing the unobserved region, and a prior model for the 
power-spectrum of fluctuations ? 

3.1. Expansion in Spherical Harmonics 

Here we use spherical harmonics to expand the galaxy distribution in a whole-sky 
survey. This technique has been considered for 2-D samples (e.g. Peebles 1973, Scharf et 
al. 1992) and more recently for analysing redshift surveys (Scharf & Lahav 1993; Lahav 
et al. 1993; Fisher, Scharf & Lahav 1993b). Basically, the projected density fleld over Att 
is expanded as a sum: 

m=-\-l 

SiOA) = Y. E ^imYUOA), (3) 

I m= — l 

where the Yi^s are the orthonormal set of spherical harmonics. A reconstruction up 
to harmonic Imax resolves structure on angular scale of Trjlmax- The spherical harmonic 
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analysis provides a unified language to describe the local cosmography as well as the 
statistical properties (e.g. power-spectrum) of the galaxy distribution. 



3.2. Mask Inversion Using Wiener Filter 

We turn now to the more complicated problem of the harmonics with incomplete sky 
coverage. Here we consider a 'sharp' mask, in which observed regions are assigned equal 
weight, while masked regions are assigned zero weight. The observed harmonics cim,obs 
(with the masked regions filled in uniformly according to the mean) are related to the 
underlying 'true' whole-sky harmonics aim by (cf. Peebles 1980, eq. 46.33) 



V m' 

where the monopole term (/' = 0) is excluded. We have added the shot-noise (Ta in 
the 'true' number- weighted harmonics a^^'s (not in the c^^'s). The noise variance is 



different harmonics and acts as a 'point spread function' (in analogy with problems in 
image processing). 

By the rule of conditional probability (cf. eq. 1) 



where the vectors a and Cgbs represent the sets of observed harmonics {aim} and {c;m,o6s}, 
and Pg stands for an assumed Gaussian distribution function with variance and covariance 
which depend on an assumed power-spectrum. As in the simple example above one can 
now find the estimator a which gives maximum probability. This is a special case of 
a constrained realizations formalism (Hoffman & Ribak 1991), but here the formulation 
and computation are greatly simplified due to the orthogonality of the harmonics. An 
alternative approach is finding the minimum of the variance (|a — FW~"^ Cobs\'^) with 




(4) 





(5) 
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respect to a desired Wiener filter matrix F (here W"-*- stands for the 'pseudo inverse' of 
W = {WJJj"^ }). We shaU present the fuU derivation by the two approaches elsewhere 
(Zaroubi et al., in preparation), but the answer for the optimal reconstruction can be seen 
by analogy with the simple example given above (eq. 2) : 

a = FW-ico6„ (6) 

with the diagonal matrix 

Only the diagonal elements appear in the F matrix due to the orthogonality of the harmon- 
ics. We emphasize again that in the case of underlying Gaussian field the most probable 
field, the mean field and the minimum variance Wiener filter are all identical. Hence these 
different approaches are unified. 

Even if the sky coverage is Att (W = I), the Wiener filter is essential to reveal the 
optimal underlying 'continuous' density field, cleaned of noise. The correction factor is per 
/, independent of m, so in the case of full sky coverage, only the amplitudes are affected by 
the correction, but not the relative phases. For example, the dipole direction is not affected 
by the shot-noise, only its amplitude. But of course, if the sky coverage is incomplete, both 
the amplitudes and the phases are corrected. The reconstruction also depends on number 
of observed and desired harmonics. Note also that the method is non-iterative. 

The variance of the residual from the optimal reconstruction is given by: 

The scatter is independent of the estimated optimal reconstruction. In the limit of negli- 
gible shot-noise the scatter vanishes and the reconstruction is deterministic. However, if 
{o^l)th ^ (Cfl) then the statistical scatter is that predicted by the a priori cosmic power- 
spectrum {a'^)th- 
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3.3. Singular Value Decomposition 

We solve eq. (6) by writing it as Cobs = Ba, where B = WF~"^, and by applying 
a Singular Value Decomposition (SVD) algorithm (e.g. Press et al. 1992). Briefly, the 
matrix B (of arbitrary dimensions) can be decomposed as B = U diag{\i} V"^, where 
both U and V are orthonormal, and A^'s are the singular values of B. The least-square 
solution is a = V diag{l / \i} U'^ Cobs- The singular values give insight into the amount 
of useful information in the problem. To ensure stability of the reconstruction it is essential 
to set to zero A^'s which are much smaller than the maximal Xmax^ before inverting by 
'backsubstitution' (see Press et al. 1992). We flnd that in our problem of a |6| = 5° 
mask (see below) the inversion is rather insensitive to this truncation level (which controls 
the amount of regularization), but it is of great importance in the case of a larger mask 
(suggesting the need for an extra regularization). Other recent applications of the SVD 
approach in Astronomy include the analysis of galaxy spectra (Rix & White 1992) and 
helioseismology (Christensen-Dalsgaard et al. 1993). 

4. APPLICATION TO IRAS 1.2 Jy DATA 

Here we apply the method to the sample of IRAS galaxies brighter than 1.2 Jy (Strauss 
et al. 1992, Fisher 1992) which includes 5313 galaxies, and covers 88 % of the sky. This 
incomplete sky coverage is mainly due to the Zone of Avoidance, which we model as a 
'sharp mask' at Galactic latitude |6| < 5°. The mean number of galaxies is V" = 392 per 
steradian, which sets the shot-noise, (cr^). As our model for the cosmic scatter {aj)th we 
adopt a flt to the observed power spectrum of IRAS galaxies (Fisher et al. 1993a) which 
is described (empirically) by a Cold Dark Matter model with a shape parameter F = 0.2, 
and with real-space normalization specifled by the rms fluctuation in the number of IRAS 
galaxies in 8 Mpc spheres, erg = 0.7 . The Wiener fllter (eq. 7) in this case drops 

7 



monotonically with /, e.g. \^j^Jy^^f^^ ~ 0.9,0.7,0.6 and 0.3 for / = 1,10,15 and 30 
respectively. 

Figure 1 shows the reconstruction of the raw 2-D IRAS 1.2 Jy sample in Aitoff pro- 
jection (in Galactic coordinates) for harmonics 1 < / < 15. The Zone of Avoidance was 
left empty, and clearly it 'breaks' the possible chain of the Supergalactic Plane and other 
structures. 

Figure 2 shows our optimal reconstruction where we have used observed harmonics 
corn's with 1 < / < 15 (255 independent coefficients in total) and reconstructed the whole- 
sky aim^s also for 1 < / < 15. Now the structure is seen to be connected across the Zone 
of Avoidance, in particular in the regions of Centaurus/Great Attractor (/ ~ 315°), Hydra 
(/ ~ 275°) and Perseus-Pisces (/ ~ 315°), conffiming the connectivity of the Supergalactic 
Plane. We also see the Puppis cluster (/ ~ 240°) recovered behind the Galactic Plane. This 
cluster has been noticed in earlier harmonic expansion (Scharf et al. 1992) and further 
studies (Lahav et al. 1993 and references therein). The other important feature of our 
reconstruction is the removal of shot noise all over the sky. This is particularly important 
for judging the reality of clusters and voids. 

We have also compared our reconstruction with the one applied (using a Att Wiener 
ffiter) to the IRAS sample in which the Zone of Avoidance was ffiled-in by extrapolating 'by 
hand' across the Galactic Plane (Yahil et al. 1991). The reconstructions look very similar 
both visually, and by and cross-correlation measures. We also found good agreement 
in the angular power-spectrum of the two reconstructions. As another test, we have used 
as a prior model the standard Cold Dark matter model (with F = 0.5), and found that the 
reconstructions changed very little. 

As a more challenging test of the method we have also used an iV-body simulation of 
standard Cold Dark matter (so the whole 'sky' true harmonics are known) and varied the 
size of the Zone of Avoidance. We find that for mask larger than |6| = 15° it is difficult 
to recover the unobserved structure. Clearly the success of the method depends on the 
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interplay of three angular scales: the width of the mask, the desired resolution {tt jlmax) 
and the physical correlation of structure. 

5 DISCUSSION 

In this Letter we have presented a Wiener filter method of reconstructing the full sky 
galaxy distribution and removing the shot noise. We have also shown that a variety of 
statistical approaches to the problem all lead to the same optimal Wiener estimator. The 
prior assumptions only depend on the observed 2-point galaxy correlation function and the 
nature of the shot-noise. While well known in image and signal processing, the method has 
not been implemented before (to our knowledge) in reconstructing the large scale structure 
in both amplitude and phase. 

The 2-D spherical harmonics presentation and Wiener filtering approach can be also 
applied to other cosmic phenomena, e.g. to maps of the COBE microwave background 
and the HEAOl X-ray background. Klypin et al. (1992) have used a similar regular- 
ization approach to analyze their Relikt map of the Microwave Background. However, 
their Tikhonov regularization procedure does not reflect the physical correlation of the 
underlying fluctuations. 

Currently, we are developing the method further to 3-D by expanding the density fleld 
in spherical harmonics and spherical Bessel functions ji{kr) : 

P(r) - P = Y1 P'"""" J'^^""' ^) Yim{r) , (9) 

Z>0 m n 

(e.g. Binney & Quinn 1991, and Lahav 1993 for a preliminary application to the 2 Jy 
redshift survey). This expansion generalizes the Wiener fllter to handle the radial selec- 
tion function and redshift distortion in harmonics (see Fisher et al. 1993b), as well as the 
shot-noise and incomplete sky coverage. The Wiener method is not limited to spherical 
harmonics or orthonormal set of functions, and can be implemented in Cartesian presenta- 
tion as well (e.g. Hoffman 1993). Our procedure will be applied to new all-sky IRAS and 
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optical redshift surveys, and to surveys of the peculiar velocity field. The 3-D noise-free 
Plmn coefficients will allow objective (non-parametric) comparison of different surveys of 
light and mass in the local universe. 
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FIGURE CAPTIONS 



Figure 1. Harmonic expansion (1 < / < 15) of the projected raw IRAS 1.2Jy data in 
Galactic AitofF projection. Regions not observed, in particular |6| < 5 (marked by dashed 
lines), were left empty. The contour levels of the projected surface number density are in 
steps of 100 galaxies per steradian (the mean projected density is A" ~ 400 galaxies per 
steradian). 

Figure 2. A Att Wiener reconstruction of the 2-D 1.2 Jy IRAS galaxy sample, for 
Harmonics 1 < / < 15, plotted in Aitoff Galactic projection. The reconstruction corrects 
for incomplete sky coverage, as well as for the shot-noise. The assumed prior model is 
an empirical fit to the observed power-spectrum of IRAS galaxies. The reconstruction 
indicates that the Supergalactic Plane is connected across the Galactic Plane at Galactic 
longitude / ~ 135° and / ~ 315°. The Puppis cluster stands out at the Galactic Plane at 
/ ~ 240°. The horizontal dashed lines at 6 = ±5° mark the major Zone of Avoidance in the 
IRAS sample, the contour levels of the projected surface number density are in steps of 
100 galaxies per steradian (the mean projected density is A" ~ 400 galaxies per steradian). 
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